Abstract of Dissertation Presented to the Graduate Council
in  Partial  Fulfillment  of  the  Requirements  for  the 
Degree  of  Doctor  of  Philosophy 


SEQUENTIAL  SUBOPTIMAL  ADAPTIVE  CONTROL 
OF  NONLINEAR  SYSTEMS 

By 

Thomas  Walter  Ellis 
December,  1966 


Chairman:  Dr.  A.  P.  Sage 

Major  Department:  Electrical  Engineering 

In this dissertation two methods of sequential suboptimal
adaptive control are presented which encompass both identification
and control. A generally nonlinear differential system is modeled
by a linear time-varying system of assumed form and of possibly lower
dimension. This system is assumed stationary over subintervals of time,
which allows a controller to generate a sequential control law that
minimizes a quadratic performance index. The use of a linear system
model and a quadratic performance index allows the controller to
operate in an on-line fashion because of the speed and ease with which
the control can be calculated and applied.

The first control philosophy presented is a form of regulator
control which enables the system to adapt to new trajectories as the
system undergoes modifications, or is affected by noise or environmental
changes. The given time interval of interest, $t \in (t_0, t_f)$, is
divided into N subintervals, $t \in (t_i, t_{i+1})$, with i = 0, 1, ..., N-1,
and a system model is determined at each $t = t_i$. Then a suitable control
is found by the maximum principle which minimizes an integral of a
time-weighted quadratic form of error and control effort over the time
interval $t \in (t_i, t_f)$.

The second control philosophy is a form of trajectory control
which forces the system to track a predetermined desired trajectory.
Again the time interval of interest, $t \in (t_0, t_f)$, is divided into N
subintervals, and a system model is determined at each $t = t_i$. The
maximum principle is then utilized to determine the control which will
drive the system sufficiently close to the desired trajectory at
$t = t_{i+1}$.

Alternate means are given for the choice of a proper model,
and invariant imbedding is presented as a means for parameter
identification and state estimation which is particularly suited for on-line
use. It is assumed that the problems of parameter identification,
state estimation, and control may be decoupled.

Several applications of these techniques are given. The control
of  a nuclear  reactor  during  startup  in  the  presence  of  input  and  output 
noise  is  presented,  along  with  a suboptimal  guidance  and  control  scheme 
for  the  low  thrust  orbital  transfer  problem  which  attempts  to  minimize 
fuel  consumption,  also  in  the  presence  of  noise.  Finally,  the  control 
of  a nuclear  rocket  engine  during  startup  is  given  along  with  reactivity 
profiles  for  several  desirable  temperature  trajectories. 


CHAPTER  I 


INTRODUCTION 

Much recent attention has been given to the solution of optimal
control problems for nonlinear systems. This effort has resulted in a
variety of methods for the computational solution of nonlinear
two-point boundary-value problems [1,2]. In these boundary-value problems
half of the boundary conditions are specified at the final time. This
implies that complete a priori knowledge of the system dynamics is
required over the time interval of operation $t \in (t_0, t_f)$. Thus, in a
large number of cases, solution of an optimal nonlinear control problem
results in the determination of an open loop control for a system
with known dynamics over the time interval of operation. In many
instances, a closed loop control is desired. Also, if there are process
variations, environmental changes, or uncertainties in the system
model, the complete knowledge of system dynamics necessary to
predetermine the open loop control cannot be obtained. Furthermore, the
presence of noise can reduce the effectiveness of a predetermined
control. For the case of a deterministic linear system with known
constant coefficients and a quadratic cost function, the closed loop
control can be obtained with relative ease [3].

Pearson [4] recognizes this situation and proposes a closed
loop suboptimal controller for nonlinear systems. It is derived from
the stable steady-state solution of the Riccati equation which results
from ordinary variational techniques. The nonlinear and nonstationary
system is optimized with respect to a quadratic performance index by
treating it as an instantaneously linear stationary system. Although
the method has some merit, it has some obvious drawbacks. First, to
generate a constant feedback controller, the time interval of interest
must be large compared with system time constants. Second, in solving
for the stable steady-state solution of the Riccati equation, it is
possible that existence and/or uniqueness difficulties can arise [5].
Finally, it can be shown that the success of the method depends strongly
on the particular system, and if the linearized model varies radically,
the results can be poor.

Kishi  [6]  has  attempted  to  develop  an  on-line  control  scheme 
for  linear  systems.  The  time  interval  of  interest  is  divided  into 
subintervals  and  at  each  of  these  subintervals  an  open  loop  control 
is  calculated  by  minimizing  the  performance  index  over  the  subinterval. 
This  is  similar  to  what  Sage  and  Eisenberg  [7]  show  although  in  this 
case  nonlinear  systems  are  treated,  and  the  open  loop  control  calculated 
at  each  subinterval  is  based  on  minimizing  the  performance  index  over 
the  remaining  time  to  go.  Each  of  these  methods  works  well  for  systems 
with  nonlinearities  in  which  the  control  has  little  effect,  but  in 
general,  they  would  not  be  satisfactory,  particularly  where  measurement 
noise  is  present. 

This dissertation attempts to develop and provide experimental
justification for an alternate approach to the sequential adaptive
control of noisy nonlinear systems based, in part, on the identification
of a linear model and use of the real time computational simplicity of
linear systems with quadratic cost functions. To be more specific, an
attempt is made to develop a means of on-line control which can be
computed rapidly due to the identification of a linear model for the plant,
and which introduces feedback by sequentially monitoring the system at
discrete time instants and updating the control. The model chosen is
a linear time-varying system which is assumed stationary over
subintervals of time, thus allowing a controller to generate a sequential
control law which minimizes, not the given performance index, but a closely
related one. The resulting control is of course only an approximation
to the predetermined optimum control. However, due to noise, and process
and environmental changes which cannot be foreseen, it may
substantially reduce the cost over that which results from a predetermined
open loop control. Even though the adoption of a suboptimal
policy may result at times in a slight reduction in system performance,
it is important to realize that the simplified calculation and
utilization of the control might result in the suboptimal policy being
optimal with respect to a certain enlarged performance index.

The selection of a proper model is very important, and Chapter II
outlines alternate means for its choice. Invariant imbedding is
introduced as a means for parameter identification and state estimation
which is particularly suited for on-line use [8]. It is assumed that the
problems of parameter identification, state estimation, and control may be
decoupled, although this does not result in a truly optimal system as
shown by Sworder [9].


In Chapters III and IV, two methods for computing on-line
control are presented. The first method enables the system to adapt to
new trajectories as the system undergoes modifications. The second
method attempts to keep the system tracking a precalculated
trajectory. The choice of methods depends upon the particular situation and
the type of control desired.

In Chapter V, applications of the above methods are given.
Specifically, the startup of a nuclear reactor, the startup of a nuclear
rocket engine, and a low thrust orbital transfer problem are treated.
These examples were chosen since they are of practical importance and
current interest.


CHAPTER II


MODELS, IDENTIFICATION AND STATE ESTIMATION

In optimal control theory it is desired to minimize a given
cost function or performance index while controlling a system whose
dynamic characteristics are given in differential or difference form
relating the state variables and control variables. For systems whose
dynamics are deterministic and completely known, the necessary
conditions for optimal performance can be established in a formal manner
once a cost function is specified. However, in many practical
situations the dynamics of the system may be complex and vary in an
unpredictable fashion. Thus it is necessary to incorporate an
identification scheme which will sequentially update information on the system
dynamics. Furthermore, in some cases either the mathematical
description of the system dynamics may be unknown or perhaps the known system
dynamics are of such complexity as to prohibit on-line computation.
Then it is often convenient to use a model of assumed form which may
be less complicated than the actual system.

This chapter attempts to present methods for choosing a
satisfactory model along with an appropriate identification scheme which is
suitable for use in a stochastic environment. The presence of noise
demands a state estimation procedure, which, as shown later in the
chapter, can be combined with the identification process.

By using this method it is implied that the estimation of the
state variables and unknown parameters is separated from the derivation
of the control. That is, once the model is identified and the state of
the system is estimated, the information obtained is used to derive the
appropriate control law. This type of approach, as shown in Figure 1,
is referred to as an "ideal adaptive" system by Kalman [1].

It  is  known  that  if  the  controller  and  state  estimator  are 
optimized independently for a linear system with white Gaussian
disturbances, an overall optimum system results for a quadratic performance
index  [2].  However,  this  is  not  true  for  nonlinear  systems;  that  is, 
when  estimates  are  used  for  the  state  variables  in  nonlinear  systems, 
the  overall  system  will  not  be  optimal.  Furthermore,  Sworder  [3]  states 
that  the  operations  of  parameter  estimation  and  optimization  cannot,  in 
general, be separated even for linear systems with a quadratic
performance index. This is due to the inevitable coupling which exists between
the  unknown  parameters  and  the  control  law.  Since  the  combination  of 
instantaneously  best  control,  best  state  estimation,  and  best  parameter 
estimation  belongs  to  a class  of  extremely  difficult  unsolved  problems, 
there  is  no  alternative  except  to  use  the  estimates  which  are  available 
for  deriving  the  control. 


Figure 1. Block diagram of an "ideal adaptive" system


Models

When attempting to control a nonlinear time-varying system in
some optimum fashion, it is always necessary to solve the
two-point boundary-value problem which arises. Although many methods
have been devised for its solution, all of them are iterative methods,
or search methods, which consume much computation time. Also,
convergence difficulties may prohibit satisfactory results, particularly if
the system to be controlled is of high order or very nonlinear. Thus,
it is often impossible to attempt on-line control of these systems.

When the system of interest is linear time-invariant many of
the problems associated with computing the optimal control are
alleviated. Therefore, when it is necessary to control a nonlinear
adaptive system in an on-line fashion, it is advantageous to use a linear
time-invariant model whenever possible.

Suppose  it  is  desired  to  control  a system  which  is  assumed  to 
be  adequately  described  by 


    $\dot{x} = f(x, u, a, t)$
    $\dot{a} = 0, \quad x(t_0) = x_0$.                          (2.1)


These  vector  differential  equations  express  the  relationship 
between the state x, an unknown constant parameter vector a, the
control vector u and time t. It is also desired to approximate (2.1) in
a fashion  such  that  an  on-line  solution  for  the  control  vector  can  be 
obtained.  Several  subclasses  of  the  identification  problem  can  now 
be posed.

If the system to be controlled possesses known dynamics and the
control enters in a linear fashion, (2.1) takes the special form

    $\dot{x} = g(x, t) + H(x, t)\,u(t)$.                        (2.2)

It  is  then  often  possible  to  adequately  approximate  the  system 
dynamics  (2.2)  by 

    $\dot{x} = A[x(t_i), t_i]\,x(t) + B[x(t_i), t_i]\,u(t)$     (2.3)

for $t \in (t_i, t_{i+1})$. In this case identification is accomplished by
measurement of the state variables and computation of the A and B matrices
of (2.3) [4,5]. If measurement noise is present, of course, it is
necessary to filter the data in order to obtain a best estimate $\hat{x}(t)$
of $x(t)$. If the control does not enter linearly, or if the system dynamics
or system model are not precisely known, the above method of
identification and modeling will normally not be satisfactory. In addition, even
if the system dynamics can be represented as in (2.2), on-line
computation of the control vector may be impossible if the dimensionality of
the state vector is high. In all of these cases, identification of an
approximate system model by other than direct measurement of the state
vector at time t is desirable.

For  the  case  where  the  model  is  to  be  of  the  same  dimensionality 
as  the  state  vector,  (2.1)  is  identified  in  the  form 

    $\dot{x} = A(t_i)\,x + B(t_i)\,u$                           (2.4)

for $t \in (t_i, t_{i+1})$, where the A and B matrices are identified by some
convenient on-line technique.

In the case where the number of state variables is too large to
permit on-line control vector computation, a model of lower
dimensionality than the original system (2.1) is identified. Specifically, the
model identified is

    $\dot{y} = A(t_i)\,y + B(t_i)\,u$                           (2.5)

for $t \in (t_i, t_{i+1})$, where $y(t) = h[x(t)]$. Here $y(t)$ is of lower
dimension than $x(t)$, and the A and B matrices are identified as functions
of the control variable and the new state variable.
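As one illustration of what a "convenient on-line technique" could look like, the sketch below fits a scalar model of the form (2.4), dy/dt = a*y + b*u, by least squares over a window of equally spaced samples. The least-squares fit, the forward-difference estimate of the derivative, and the names `identify_scalar_model` and `simulate` are assumptions made for this example only; the scheme actually developed in this chapter is the invariant imbedding estimator.

```python
import math


def identify_scalar_model(y, u, dt):
    """Least-squares fit of dy/dt = a*y + b*u from equally spaced samples.

    Hypothetical helper: y has one more sample than u is required to have.
    """
    syy = syu = suu = syd = sud = 0.0
    for k in range(len(y) - 1):
        d = (y[k + 1] - y[k]) / dt  # forward-difference estimate of dy/dt
        syy += y[k] * y[k]
        syu += y[k] * u[k]
        suu += u[k] * u[k]
        syd += y[k] * d
        sud += u[k] * d
    det = syy * suu - syu * syu  # determinant of the 2 x 2 normal equations
    if abs(det) < 1e-12:
        raise ValueError("data do not excite both parameters")
    a = (syd * suu - sud * syu) / det
    b = (sud * syy - syd * syu) / det
    return a, b


def simulate(a, b, y0, dt, n):
    """Generate test data by Euler integration with a sinusoidal input."""
    y, u = [y0], []
    for k in range(n):
        u.append(math.sin(3.0 * k * dt))
        y.append(y[-1] + dt * (a * y[-1] + b * u[-1]))
    return y, u
```

In a sequential setting the fit would be repeated at each $t_i$ on the most recent window of data, so that the pair (a, b) plays the role of $A(t_i)$ and $B(t_i)$ over the next subinterval.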

It should be stressed that the choice of model is very
important and should thus be made judiciously. Before starting an
analysis, it is essential to examine most carefully the assumptions that
are  being  made  and  the  results  expected.  Quite  often  a simple  model 
should  be  chosen  initially  and  as  more  is  learned  about  the  system  and 
its  behavior,  the  model  can  be  modified.  Although  the  model  may  be  an 
approximation,  it  may  prevent  more  drastic  approximations  being  made  in 
the  solution  of  the  associated  equations  [6]. 

Identification  and  State  Estimation 

Parameter and state estimation may both be considered in the
realm of the process of estimation, the process of making a decision,
or judgment, concerning the approximate value of certain undefined
objects when the decision is weighted, or influenced, by all available
information. Most methods which have been proposed for state and
parameter estimation have a common failing in that they are not generally
suited for on-line computation. One exception to this is the sequential
estimation scheme developed by Detchmendy and Sridhar [7] and presented
in this chapter. This scheme is an extension of earlier work done by
Bellman and Kalaba [8,9] and has been modified for use with discrete
systems by Sage and Masters [10,11]. It is particularly suited for
on-line computation and also has the advantage of performing both state
and parameter estimation simultaneously, as shown in the examples of
Chapter V.

Consider  the  class  of  systems  defined  by 


    $\dot{x} = g(t, x) + k(t, x)\,w$
    $z = h(t, x) + v$                                           (2.6)

where x is the "generalized" state vector of dimension n and includes
the unknown parameter vector which has been adjoined to the original
state vector. Also,

    g(t,x) = n vector function
    k(t,x) = n x p matrix function
    w      = p vector unknown input
    h(t,x) = m vector function
    z      = m vector output
    v      = m vector measurement error.

It is desired to find the least square estimate of x, designated $\hat{x}$,
which minimizes the cost function

    $J = \int_0^T \left[ \|z - h(t,\hat{x})\|^2_{W_1} + \|\dot{\hat{x}} - g(t,\hat{x})\|^2_{W_2} \right] dt$     (2.7)

or, alternately written as

    $J = \int_0^T \left[ \|z - h(t,\hat{x})\|^2_{W_1} + \|w\|^2_{k' W_2 k} \right] dt$     (2.8)

where $W_1$ and $W_2$ are weighting matrices which determine the relative
weighting to be placed on the individual terms in the cost function.

The method of least squares is used primarily due to historical
precedent. Since its discovery by Gauss [12], it has been used with much
success on many estimation problems.

By writing the Hamiltonian and making use of the maximum
principle, a two-point boundary-value problem results for which some of the
boundary conditions are specified at t = 0 and some at t = T, where now
the variable T is regarded as the independent variable. By further
utilizing the invariant imbedding equation, which is derived in Appendix
A, the following set of sequential estimation equations is obtained.


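    $\dot{\hat{x}} = g(T,\hat{x}) + 2P(T)\,H(T,\hat{x})\,W_1\,[z(T) - h(T,\hat{x})]$

    $\dot{P} = g_{\hat{x}}(T,\hat{x})\,P + P\,g'_{\hat{x}}(T,\hat{x})$
    $\qquad + 2P\,\{H\,W_1\,[z(T) - h(T,\hat{x})]\}_{\hat{x}}\,P$
    $\qquad + \tfrac{1}{2}\,k(T,\hat{x})\,V^{-1}(T,\hat{x})\,k'(T,\hat{x})$     (2.9)

where

    $g_{\hat{x}} = \partial g / \partial \hat{x}$,
    $V(T,\hat{x}) = k'(T,\hat{x})\,W_2\,k(T,\hat{x})$,

and

    P(T) = n x n matrix.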


The  sequential  nature  of  these  estimation  equations  is  brought  out  by 
the fact that T is a running variable. The derivation of these
equations is given in detail in Appendix B.
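To make the sequential character of (2.9) concrete, the sketch below specializes it to a scalar state that is observed directly, that is, h(t,x) = x and k = 1. Under those assumptions H = 1, V = W_2, and the derivative of the bracketed term with respect to the estimate is -W_1, so the P equation becomes dP/dT = 2*g_x*P - 2*W_1*P**2 + 1/(2*W_2). The Euler integration, the scalar restriction and the function name `sequential_estimate` are choices made for this illustration, not part of the original development.

```python
def sequential_estimate(g, g_x, z_samples, dt, x_hat0, p0, w1=1.0, w2=1.0):
    """Scalar form of the sequential estimator (2.9) with h(t, x) = x, k = 1.

    g(t, x) is the assumed model, g_x(t, x) its partial derivative in x,
    and z_samples are the measurements taken every dt seconds.
    """
    x_hat, p = x_hat0, p0
    for step, z in enumerate(z_samples):
        t = step * dt
        residual = z - x_hat                      # z(T) - h(T, x_hat)
        x_dot = g(t, x_hat) + 2.0 * p * w1 * residual
        p_dot = 2.0 * g_x(t, x_hat) * p - 2.0 * w1 * p * p + 0.5 / w2
        x_hat += dt * x_dot                       # Euler step in the running variable T
        p += dt * p_dot
    return x_hat, p
```

Each new measurement updates the estimate and P in place, so no past data need be stored. Estimating an unknown parameter alongside the state would require the vector form, with the parameter adjoined to the state as described for (2.6).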

Summary 

This chapter has presented various linear time-invariant models
which can be used to approximate nonlinear time-varying systems.
A method has been introduced which will identify these models in a
stochastic  environment  in  an  on-line  fashion.  These  results  will  be 
combined  with  the  methods  of  control  presented  in  Chapters  III  and  IV 
to  provide  a suboptimal  adaptive  system. 




List  of  References 


1. Kalman, R. E., "Fundamental Study of Adaptive Control Systems,"
Technical Report No. ASD-TR-61-27, Vol. 1, Flight Controls
Laboratory, Aeronautical Systems Division, Air Force Systems Command,
Wright-Patterson Air Force Base, Ohio, 1962.

2. Joseph, P. D., and J. T. Tou, "On Linear Control Theory," AIEE
Transactions on Applications and Industry, Vol. 80, September, 1961.

3. Sworder, D. D., "A Study of the Relationship Between Identification
and Optimization in Adaptive Control Problems," Journal of the
Franklin Institute, Vol. 281, No. 3, 1966.

4. Pearson, J. D., "Approximation Methods in Optimal Control," Journal
of Electronics and Control, Vol. 13, No. 5, November, 1962.

5. Westcott, J. H., J. J. Florentin, and J. D. Pearson, "Approximation
Methods in Optimal and Adaptive Control," Proceedings Second IFAC
Conference, Zurich, 1963.

6. Bellman, R. E., Adaptive Control Processes: A Guided Tour,
Princeton, New Jersey: Princeton University Press, 1961.

7. Detchmendy, D. M., and R. Sridhar, "Sequential Estimation of State
and Parameters in Noisy Non-Linear Dynamical Systems," ASME
Journal of Basic Engineering, Vol. 88, Series D, No. 2, June, 1966.

8. Bellman, R. E., and R. Kalaba, "On the Fundamental Equations of
Invariant Imbedding, I," Proceedings of the National Academy of
Sciences, U. S. A., Vol. 47, 1961.

9. Bellman, R. E., and R. Kalaba, "Dynamic Programming, Invariant
Imbedding, and Quasilinearization: Comparisons and
Interconnections," Rand Corporation Report RM-4038-PR, March, 1964.

10. Sage, A. P., and G. W. Masters, "On-Line Estimation of States and
Parameters for Discrete Non-Linear Dynamic Systems," Proceedings
of the National Electronics Conference, Vol. 22, 1966.

11. Sage, A. P., and G. W. Masters, "Identification and Modeling of
States and Parameters of Nuclear Reactor Systems," IEEE
Transactions on Nuclear Science, February, 1967.

12. Gauss, Karl F., Theory of the Motion of the Heavenly Bodies
Moving About the Sun in Conic Sections, New York: Dover Publications,
Inc., 1963.


CHAPTER  III 


SUBOPTIMAL  ADAPTIVE  REGULATOR  CONTROL 


When a system undergoes unforeseen changes or disturbances, it
is impossible to derive a precalculated control which will force the
system to operate in some optimal fashion for all time. It then becomes
necessary to monitor the system, and as the system deviations are noted,
the control law can be altered in such a way as to maintain an
acceptable system performance. Thus the overall system can be regarded as
adaptive. Although there is no definition at present for an adaptive
system which meets with general acceptance, Kalman [1] gives the
following definition:

    A control system is adaptive if it is possible for it to
    change its control law as a result of measured changes of the
    control object and its environment and in such a way as to
    operate at all times in an optimal or nearly optimal fashion.

The successful operation of an adaptive control system will
depend on the estimation of the state variables and the identification
of the system dynamics. This is the problem considered in Chapter II.
In this chapter it is desired to introduce a feasible method for the
calculation of an effective control law. The derivation of this method
results from an emphasis being placed on the reduction of complex
calculations, thereby reducing the calculation time. This hopefully allows
the controller to function in an on-line fashion.

Derivation of Control Law

Assume a given nonlinear system of the form


    $\dot{x} = f(x, u, t)$
    $x(t_0) = x_0$,                                             (3.1)

where x is an n-dimensional vector and u is an r-dimensional vector.
It is desired to control this system over the time interval $t \in (t_0, t_f)$
while minimizing the performance index

    $J = \theta[x(t_f)] + \int_{t_0}^{t_f} \phi(x, u, t)\,dt$.  (3.2)

The nonlinear system is identified as a linear system

    $\dot{x} = A(t_i)\,x(t) + B(t_i)\,u(t)$                     (3.3)

and the cost function approximated as

    $J = \tfrac{1}{2}\,\tilde{x}'(t_f)\,W\,\tilde{x}(t_f) + \tfrac{1}{2}\int_{t_0}^{t_f} [\tilde{x}' Q \tilde{x} + u' R u]\,dt$     (3.4)

where

    $\tilde{x} = x(t) - x_d$                                    (3.5)

and where $x_d$ represents the desired final state of the system at time
$t_f$. The sequential suboptimal control is obtained by first dividing
the time interval of control into N subintervals $(t_{i+1} - t_i)$ where
$t \in (t_i, t_{i+1})$ and i = 0, 1, ..., N - 1. The nonlinear system (3.1) is
identified as a linear system (3.3) over the interval $t \in (t_i, t_f)$.
An optimum control is then found which will minimize, not the given
cost function (3.2), but a related cost function,

    $J = \tfrac{1}{2}\,\tilde{x}'(t_f)\,W\,\tilde{x}(t_f) + \tfrac{1}{2}\int_{t_i}^{t_f} C(t)\,[\tilde{x}' Q \tilde{x} + u' R u]\,dt$.     (3.6)

Once the control is found, it is applied over the time interval
$t \in (t_i, t_{i+1})$. It is significant to note that even though the A and B
matrices are identified at time $t_i$, based on information up to time $t_i$,
A and B are assumed to remain constant throughout $t \in (t_i, t_f)$. This is
not unreasonable since C(t), a weighting term, may be selected so as
to offset any error introduced by the assumption.

The choice of C(t) is thus very important. Although (3.3) is
linear time-invariant, the inclusion of C(t) in (3.6) will, in general,
lead to a two-point boundary-value problem for which the canonic
equations are time varying. These can be solved, but the solution time is
often incompatible with the requirement for on-line control. To
circumvent this difficulty, it is advantageous to let

    $C(t) = e^{\omega t}$                                       (3.7)

where $\omega$ is some constant to be determined by the particular system
to be controlled. Using (3.7) the resulting two-point boundary-value
problem can be easily solved.

To show this, take the system

    $\dot{x} = A x + B u$
    $x(t_i) = x_i$,                                             (3.8)

and the performance index

    $J = \tfrac{1}{2}\int_{t_i}^{t_f} [x' Q x + u' R u]\,e^{\omega t}\,dt$.     (3.9)

The maximum principle of Pontryagin [2] is well suited for this type
of problem and is used here. Its application is well known and may
be found in many references [3,4,5,6]. Therefore its proof is not
given here and only the results are used.

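For the system given by (3.8) and the performance index given
by (3.9) the Hamiltonian is given by

    $H = \tfrac{1}{2}[x' Q x + u' R u]\,e^{\omega t} + \lambda'[A x + B u]$     (3.10)

where $\lambda$ is the n-dimensional Lagrange multiplier vector. The optimal
control is found by equating

    $\partial H / \partial u = 0$                               (3.11)

to give

    $u = -R^{-1} B' \lambda\,e^{-\omega t}$.                    (3.12)

The canonic equations are given by

    $\dot{x} = \partial H / \partial \lambda$                   (3.13)

    $\dot{\lambda} = -\partial H / \partial x$                  (3.14)

so that

    $\dot{x} = A x + B u$                                       (3.15)

    $\dot{\lambda} = -Q\,e^{\omega t}\,x - A' \lambda$.         (3.16)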
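The endpoint conditions on $\lambda$ are given by the transversality condition

    $\lambda(t_f) = 0$.                                         (3.17)

Create a $\sigma$ vector of dimension n-r and define it to be

    $\sigma = [\lambda_{r+1}, \ldots, \lambda_n]'\,e^{-\omega t}$.     (3.18)

Adjoin this $\sigma$ vector to the u vector, yielding a vector $\Gamma$.
This new vector can be written as

    $\Gamma = \begin{bmatrix} u \\ \sigma \end{bmatrix} = \begin{bmatrix} -R^{-1} B' \\ 0 \mid I \end{bmatrix} \lambda\,e^{-\omega t}$.     (3.19)

Let

    $M = \begin{bmatrix} -R^{-1} B' \\ 0 \mid I \end{bmatrix}$.     (3.20)

Now

    $\Gamma = \begin{bmatrix} u \\ \sigma \end{bmatrix} = M \lambda\,e^{-\omega t}$     (3.21)

and

    $\dot{\lambda} = M^{-1} \dot{\Gamma}\,e^{\omega t} + \omega M^{-1} \Gamma\,e^{\omega t}$.     (3.22)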
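Using (3.21) and (3.22), (3.16) simplifies to

    $\dot{\Gamma} = \Omega x + \Theta \Gamma$                   (3.23)

where

    $\Omega = -M Q$
    $\Theta = -[M A' M^{-1} + \omega I]$.                       (3.24)

By further writing

    $B u = [\,B \mid 0\,] \begin{bmatrix} u \\ \sigma \end{bmatrix} = C \begin{bmatrix} u \\ \sigma \end{bmatrix} = C\,\Gamma$     (3.25)

where the constant matrix $C = [\,B \mid 0\,]$ is distinct from the weighting
term C(t), equations (3.15) and (3.16) become

    $\dot{x} = A x + C\,\Gamma$                                 (3.26)

    $\dot{\Gamma} = \Omega x + \Theta \Gamma$                   (3.27)

with

    $x(t_i) = x_i$                                              (3.28)

and

    $\Gamma(t_f) = 0$.                                          (3.29)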
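The step from (3.16) to (3.23) can be checked directly. The short derivation below is a reconstruction from (3.16), (3.21) and (3.22) as they appear in this chapter, and it assumes that M is nonsingular. It is offered as a consistency check rather than as the original working.

```latex
\begin{align*}
\lambda &= M^{-1}\Gamma\,e^{\omega t}
  && \text{from (3.21)} \\
M^{-1}\dot{\Gamma}\,e^{\omega t} + \omega M^{-1}\Gamma\,e^{\omega t}
  &= -Q\,e^{\omega t}\,x - A' M^{-1}\Gamma\,e^{\omega t}
  && \text{(3.22) and (3.21) substituted in (3.16)} \\
\dot{\Gamma} &= -M Q\,x - \left[ M A' M^{-1} + \omega I \right]\Gamma
  && \text{after multiplying by } M e^{-\omega t}
\end{align*}
```

The factor $e^{\omega t}$ cancels from every term, which is why the exponential weighting leaves a set of equations with constant coefficients.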
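Having obtained this linear time-invariant set of equations, the
solution may be written in the form

    $\begin{bmatrix} x(t) \\ \Gamma(t) \end{bmatrix} = \exp\left\{ \begin{bmatrix} A & C \\ \Omega & \Theta \end{bmatrix} (t - t_i) \right\} \begin{bmatrix} x(t_i) \\ \Gamma(t_i) \end{bmatrix} = e^{S(t_i)(t - t_i)} \begin{bmatrix} x(t_i) \\ \Gamma(t_i) \end{bmatrix}$     (3.30)

or,

    $\begin{bmatrix} x(t) \\ \Gamma(t) \end{bmatrix} = \Phi(t - t_i) \begin{bmatrix} x(t_i) \\ \Gamma(t_i) \end{bmatrix} = \begin{bmatrix} \Phi_{xx}(t - t_i) & \Phi_{x\Gamma}(t - t_i) \\ \Phi_{\Gamma x}(t - t_i) & \Phi_{\Gamma\Gamma}(t - t_i) \end{bmatrix} \begin{bmatrix} x(t_i) \\ \Gamma(t_i) \end{bmatrix}$.     (3.31)

$\Gamma(t_i)$ is unknown, but can be easily found from

    $\Gamma(t) = \Phi_{\Gamma x}(t - t_i)\,x(t_i) + \Phi_{\Gamma\Gamma}(t - t_i)\,\Gamma(t_i)$.     (3.32)

Since $\Gamma(t_f) = 0$,

    $\Gamma(t_i) = -\Phi_{\Gamma\Gamma}^{-1}(\tau_i)\,\Phi_{\Gamma x}(\tau_i)\,x(t_i)$     (3.33)

where $\tau_i = t_f - t_i$ is the time to go. Now, the first r components,
which comprise the control vector, can be calculated using (3.33).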
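The calculation in (3.30) through (3.33) can be sketched for the simplest case, a scalar plant with n = r = 1, so that $\Gamma$ is just u, M = -b/r, C = b, $\Omega$ = bq/r and $\Theta$ = -(a + $\omega$). The helper names `mat_mul`, `mat_exp` and `scalar_control`, the list-of-lists matrix representation, and the particular series length and number of squarings are assumptions of this sketch, not part of the original development.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_exp(s, squarings=8, terms=12):
    """exp(s): truncated Taylor series on s / 2**squarings, then repeated squaring."""
    n = len(s)
    scale = 2.0 ** squarings
    small = [[v / scale for v in row] for row in s]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, small)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def scalar_control(a, b, q, r, omega, x_i, tau):
    """u(t_i) from (3.33) for dx/dt = a*x + b*u and the cost (3.9), n = r = 1."""
    m = -b / r                     # M of (3.20)
    big_omega = -m * q             # Omega of (3.24)
    theta = -(a + omega)           # Theta of (3.24); M a M^-1 = a for a scalar
    s = [[a, b], [big_omega, theta]]            # S with C = b
    phi = mat_exp([[v * tau for v in row] for row in s])
    return -phi[1][0] / phi[1][1] * x_i         # Gamma(t_i), which is u(t_i) here
```

With $\omega$ = 0 the result can be compared with the familiar finite-horizon regulator gain for a scalar system, which gives a convenient check on the signs in (3.24).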
For sufficiently small subintervals, $t \in (t_i, t_{i+1})$, it is feasible to let
the control remain constant. Thus, it is necessary to calculate only
$u(t_i)$. This sometimes necessitates using a smaller subinterval than
would otherwise be required. If this does give unsatisfactory results,
then there should still be no problem in operating on-line since the
control can be computed in real time using (3.32) and applied as it is
computed.
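The overall sequence (identify a model at $t_i$, compute $u(t_i)$ from the time to go, hold it over the subinterval, repeat) can be sketched as a loop. Everything specific in the example is an assumption made for illustration: the plant dx/dt = x - x**3 + u is hypothetical, the model a_i = 1 - x_i**2 is obtained by writing g(x) = (1 - x**2)*x in the manner of (2.3), the state is taken as measured without noise, and the gain is the closed form to which (3.33) reduces for a scalar system with $\omega$ = 0.

```python
import math


def regulator_gain(a, b, q, r, tau):
    """Gain k(tau) with u = -k*x: scalar, omega = 0 special case of (3.33)."""
    beta = math.sqrt(a * a + b * b * q / r)
    th = math.tanh(beta * tau)
    return (b / r) * q * th / (beta - a * th)


def run_sequential_regulator(x0, t_f, n_sub, q=1.0, r=1.0, substeps=20):
    """Sequential control of the illustrative plant dx/dt = x - x**3 + u."""
    h = t_f / n_sub                # length of each subinterval
    x = x0
    for i in range(n_sub):
        a_i = 1.0 - x * x          # model identified at t_i
        b_i = 1.0
        tau = t_f - i * h          # time to go
        u = -regulator_gain(a_i, b_i, q, r, tau) * x   # held over the subinterval
        dt = h / substeps
        for _ in range(substeps):  # Euler integration of the true plant
            x += dt * (x - x ** 3 + u)
    return x
```

For example, `run_sequential_regulator(0.5, 5.0, 50)` drives the state from 0.5 toward the origin even though the uncontrolled plant is unstable there. Because the cost used in the sketch has no terminal weighting, the gain falls toward zero as the time to go shrinks.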

Although  the  derivation  shown  here  assumes  the  performance 
index  given  in  (3.9),  a slight  reformulation  yields  similar  results  for 
the  performance  index  given  in  (3.6). 

Matrix  Exponential 

The  most  prohibitive  factor  in  using  this  scheme  in  an  on-line 
fashion  is  the  calculation  of  the  matrix  exponential  exp  (S(t^) (t^-t^)} 
to  find  the  transition  matrix  $(t^-t^) . This  matrix  exponential  and 
some  of  its  properties  are  discussed  by  Kalman  [1],  along  with  some  of 
the  accuracy  limitations  encountered  in  its  calculation.  Since  it  is 
calculated  by  taking  a finite  number  of  terms  in  a Taylor  series,  cal- 
culation times  become  important  if  too  many  terms  are  needed  before 
the  desired  accuracy  is  reached.  This  of  course  can  occur  when  the 
elements  of  the  matrix  S(t_^)  (t^-4  ) are  large.  One  way  to  circumvent 
this  difficulty  is  to  let 

e^{S(t_i)(t_f - t_i)} = \left[ e^{S(t_i)(t_f - t_i)/m} \right]^m.    (3.34)


Now the elements of the matrix used in the Taylor series have been
reduced by a factor of m, thus allowing much faster convergence to
the correct answer. The series then yields the m-th root of the
transition matrix, which is multiplied by itself (m-1) times to yield
the transition matrix. In general, a considerable difference in
calculation time results even for m = 2.
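
The scaling idea can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the helper names (mat_mul, expm_taylor, expm_scaled), the list-of-rows matrix storage, and the choice of test matrix are hypothetical and are not taken from the programs used in this work.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    """Return the n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def expm_taylor(a, terms):
    """Truncated Taylor series I + A + A^2/2! + ... + A^terms/terms!."""
    n = len(a)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms + 1):
        term = mat_mul(term, a)                       # A^k / (k-1)!
        term = [[x / k for x in row] for row in term]  # A^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def expm_scaled(a, m, terms):
    """Form exp(A/m) by Taylor series, then raise it to the m-th power."""
    scaled = [[x / m for x in row] for row in a]
    root = expm_taylor(scaled, terms)   # m-th root of the transition matrix
    result = root
    for _ in range(m - 1):              # (m-1) matrix multiplications
        result = mat_mul(result, root)
    return result
```

For a matrix with large elements, the same number of series terms gives a much smaller truncation error after scaling, at the price of (m-1) extra matrix multiplications.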

If large subintervals are used such that the control must be
recalculated every \Delta t seconds during the subinterval t \in (t_i, t_{i+1}), then
the following property of the transition matrix, valid for
time-invariant systems, may be used.

\Phi[n(t - t_0)] = [\Phi(t - t_0)]^n.    (3.35)

This reduces the problem of calculating a matrix exponential every
\Delta t seconds to a simple matrix multiplication.
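
As a hedged illustration of this property, the following Python sketch builds the transition matrices for successive multiples of \Delta t by repeated multiplication. The double-integrator transition matrix phi and the function names are hypothetical choices made only so that the result can be checked by hand.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def phi(t):
    """Transition matrix of the assumed example system x1' = x2, x2' = 0."""
    return [[1.0, t], [0.0, 1.0]]


def transition_sequence(phi_dt, count):
    """Return [Phi(dt), Phi(2 dt), ..., Phi(count dt)] using Phi(n dt) = Phi(dt)^n."""
    seq = [phi_dt]
    for _ in range(count - 1):
        seq.append(mat_mul(seq[-1], phi_dt))  # one multiplication per new sample
    return seq
```

Only one matrix exponential, Phi(dt), has to be evaluated; every later sample costs a single matrix multiplication.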


Summary 

This chapter has presented a suboptimal adaptive regulator control
scheme for a nonlinear system with a given cost function. The nonlinear
system is modeled by a linear time-varying system of assumed form. This
model is assumed stationary over subintervals of time, which allows a
controller to generate, in an on-line fashion, a sequential control law
which minimizes an integral of a time-weighted quadratic form of error
and control effort.

The  emphasis  has  been  placed  on  computational  simplicity  and 
a discussion  has  been  presented  on  the  matrix  exponential  and  its 
calculation. 
CHAPTER IV


SUBOPTIMAL  ADAPTIVE  TRAJECTORY  CONTROL 

In some instances it may be worthwhile to force a system to
follow either a predetermined desirable trajectory or possibly a
precalculated trajectory which is optimal in some sense. Trajectory
tracking methods are particularly suited to systems which are
difficult to control or perhaps need to be constrained within certain limits.
However, this form of control may result in a somewhat higher overall
cost since it does not allow the system to adapt to a new trajectory
when environmental changes or noise are present.

Several schemes have been developed for accomplishing trajectory
control. For example, the control of a typical lifting re-entry
vehicle about a nominal trajectory has been investigated by
Kovatch [1] using linearized equations about the nominal trajectory
and a quadratic cost function. This particular method requires the
storage of the precomputed nominal trajectory and a set of precomputed
feedback gains. The controller acts on deviations of the state
variables from this nominal trajectory by using the precalculated gains.
This is essentially a form of open-loop control and is not desirable
when the system may be unknown a priori. Also, a large memory is
required, as is typical of most trajectory tracking schemes.

Breakwell, Speyer, and Bryson [2] propose a method of control
which minimizes a terminal quantity while satisfying specified terminal
conditions in the presence of small disturbances. The scheme is based
on a linear perturbation from a nominal optimum path and involves the
use of the second variation of the calculus of variations.

The method presented in this chapter ensures acceptable
performance if the system model is well defined and enough points are
taken along the desired trajectory. The overall system is adaptive
since the system model is identified in a sequential manner, and an
appropriate control is then determined which forces the system to
operate on or near the desired trajectory. It should be noted that the
emphasis is placed on the simplicity of the control, the ease and
speed with which it may be computed, and the possible minimization of
memory requirements.


Derivation  of  Control  Law 

Assume a system, such as (3.1), which is to track a given
trajectory over the interval t \in (t_0, t_f). The given trajectory may be
one which is dictated by certain problem constraints or simply by desired
responses. In other instances, it may be the result of the
minimization of a performance index as in (3.2). In either case the state
vector for the desired trajectory for t = t_1, ..., t_f is stored
for future use and designated x_d(t_i). If the time increments
\Delta t_i = t_{i+1} - t_i are equal, then t_f - t_0 = m \Delta t_i.
For the suboptimal control scheme the system is identified as

\dot{x} = A(t_i) x + B(t_i) u    (4.1)

where A and B are constant matrices over the interval t \in (t_i, t_{i+1}).

For each subinterval, a cost function of the form

V_i = \frac{1}{2} [x(t_{i+1}) - x_d(t_{i+1})]' P [x(t_{i+1}) - x_d(t_{i+1})] + \frac{1}{2} \int_{t_i}^{t_{i+1}} u' R(t_i) u \, dt    (4.2)


is chosen. The elements of P determine how closely the predetermined
trajectory should be followed. The total cost for the time interval
t \in (t_0, t_f) is then

\phi = \sum_{i=0}^{N-1} V_i.    (4.3)


At each t = t_i the two-point boundary-value problem must be solved,
at least for the initial values of the control. As mentioned in
Chapter III, if the subinterval t \in (t_i, t_{i+1}) is small, the
control can be held constant over the entire subinterval with
essentially the same result as if it is varied.

The suboptimal control can be found using the maximum principle
of Pontryagin [3]. The Hamiltonian can be written as

H = \frac{1}{2} u' R u + \lambda' [A(t_i) x + B(t_i) u].    (4.4)

For simplicity, it is desirable to write A(t_i) and B(t_i) as A and B
since they are constant matrices over each subinterval. The canonic
equations yield
u = -R^{-1} B' \lambda    (4.5)

\dot{x} = A x + B u    (4.6)

\dot{\lambda} = -A' \lambda    (4.7)

x(t_i) = x_i    (4.8)

and the transversality condition at t = t_{i+1} gives

\lambda(t_{i+1}) = P [x(t_{i+1}) - x_d(t_{i+1})].    (4.9)


It appears desirable to convert the equations (4.5) through
(4.9) into equations from which the control can be computed directly.
With this in mind, define an (n-r)-dimensional vector a as being


(4.10) 


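Adjoin this a vector to the u vector so that

\begin{bmatrix} u \\ a \end{bmatrix} = \begin{bmatrix} -R^{-1} B' \\ 0 \mid I \end{bmatrix} \lambda    (4.11)

where I is an r x r identity matrix. Let

M = \begin{bmatrix} -R^{-1} B' \\ 0 \mid I \end{bmatrix}    (4.12)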

and

\Gamma = \begin{bmatrix} u \\ a \end{bmatrix}.    (4.13)

Thus,

\Gamma = M \lambda    (4.14)

and

\lambda = M^{-1} \Gamma.    (4.15)


Then, equations (4.6) and (4.7) become

\dot{x} = A x + C \Gamma    (4.16)

\dot{\Gamma} = \Theta \Gamma    (4.17)

where

\Theta = -M A' M^{-1}    (4.18)

and C is an n x n matrix defined by

C = [B \mid 0].    (4.19)


The initial conditions on x are given by (4.8), and the endpoint
conditions on \Gamma are given by

\Gamma(t_{i+1}) = \begin{bmatrix} u(t_{i+1}) \\ a(t_{i+1}) \end{bmatrix} = \begin{bmatrix} -R^{-1} B' \\ 0 \mid I \end{bmatrix} P [x(t_{i+1}) - x_d(t_{i+1})]

or

\Gamma(t_{i+1}) = M P [x(t_{i+1}) - x_d(t_{i+1})]    (4.20)

where I is an r x r identity matrix.

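The solution of (4.16), (4.17), (4.20), and (4.8) is given by

\begin{bmatrix} x(t) \\ \Gamma(t) \end{bmatrix} = \exp\left\{ \begin{bmatrix} A & C \\ 0 & \Theta \end{bmatrix} (t - t_i) \right\} \begin{bmatrix} x(t_i) \\ \Gamma(t_i) \end{bmatrix}    (4.21)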


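where \Gamma(t_i) is yet to be found. To do this, define

\exp\left\{ \begin{bmatrix} A & C \\ 0 & \Theta \end{bmatrix} (t - t_i) \right\} = \begin{bmatrix} \Phi_{xx}(t-t_i) & \Phi_{x\Gamma}(t-t_i) \\ \Phi_{\Gamma x}(t-t_i) & \Phi_{\Gamma\Gamma}(t-t_i) \end{bmatrix}.    (4.22)

At t = t_{i+1}, \Gamma is known in terms of x. Utilizing this,
\Gamma(t_i) is found to be

\Gamma(t_i) = \psi^{-1} [\xi x(t_i) + M P x_d(t_{i+1})]    (4.23)

where

\psi = M P \Phi_{x\Gamma}(t_{i+1}-t_i) - \Phi_{\Gamma\Gamma}(t_{i+1}-t_i)    (4.24)

and

\xi = \Phi_{\Gamma x}(t_{i+1}-t_i) - M P \Phi_{xx}(t_{i+1}-t_i).    (4.25)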
Since the first r components of \Gamma comprise the control vector, the
control can be applied after computing (4.23).
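
A minimal scalar sketch of this computation is given below in Python, for the special case n = r = 1, where \Gamma reduces to u and no a vector is needed. It assumes that the endpoint condition reads \Gamma(t_{i+1}) = M P [x(t_{i+1}) - x_d(t_{i+1})], so that the products M P appear in \psi and \xi; the function name, argument names, and numerical values are hypothetical.

```python
import math


def scalar_tracking_control(a, b, r_weight, p_weight, x_i, x_d_next, dt):
    """Initial control u(t_i) for the scalar model x' = a x + b u (a != 0).

    Assumed scalar forms: M = -b / R, Theta = -a, C = b, so the augmented
    system matrix is [[a, b], [0, -a]] and its exponential is known in
    closed form.
    """
    m = -b / r_weight                     # M reduces to -R^{-1} B'
    mp = m * p_weight                     # the product M P
    phi_xx = math.exp(a * dt)
    phi_gg = math.exp(-a * dt)
    phi_xg = b * math.sinh(a * dt) / a    # off-diagonal block of the exponential
    psi = mp * phi_xg - phi_gg
    xi = -mp * phi_xx                     # the Phi_Gamma_x block is zero here
    return (xi * x_i + mp * x_d_next) / psi
```

Propagating x and \Gamma over one subinterval with the returned control should reproduce the assumed endpoint condition, which gives a simple consistency check on any implementation.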

It is important to note that the given trajectory may be
followed more closely by taking more points. However, this has the
disadvantages of requiring more computation time and also larger memory
storage. These may be critical factors and thus require careful
attention. The situation can possibly be helped by choosing more points
where the system response is changing rapidly and fewer points where the
system response is changing slowly.


Matrix  Inverse 


The most time consuming calculations in this scheme are the
calculation of the matrix exponential given in (4.22) and the
calculation of the matrix inverse given in (4.23). Suggestions
concerning the calculation of the matrix exponential are given in Chapter III.
Although many methods have been devised for the calculation of the
matrix inverse [4], there is no universal best method. It can be shown
that a straightforward calculation of a matrix inverse from its
definition,

\psi^{-1} = \frac{\text{transposed cofactor matrix of } \psi}{\text{determinant of } \psi},    (4.26)

requires (n!)(n^2 - n - 1) multiplications and n^2 divisions when the
matrix \psi is of dimension n [5]. For large n, this requires
relatively large computer times. Also, since the elements of \psi are
the result of physical measurements, each one will contain errors.

In addition, limitations on word length, or numerical accuracy, of
computational methods or the computer itself introduce some error
for each multiplication or division. Thus, numerical accuracy is a
direct function of the number of multiplications and divisions
employed in the inversion algorithm used.

Occasionally, it may happen that the matrix to be inverted is
singular, and the inverse does not exist. However, the concept of
a "generalized inverse" [6] can be introduced to sidestep both the
theoretical and practical problems of singular matrices.

Due to the obvious difficulties which may be encountered when
obtaining a matrix inverse, care should be taken in the choice of
a suitable computational algorithm. For the examples given in
Chapter V, an algorithm which employs the Gaussian elimination method
[7] is used and found very efficient with respect to both speed and
accuracy.

The Gaussian elimination method indirectly gives the matrix
inverse by first solving the equation Ax = y, where A is an n x n
matrix, x is an n vector, and y is an n vector. This equation is
solved by the elimination of one unknown at a time through substitution.
Note, however, that if y_i is defined as the vector whose components are
all zero except for the i-th component, which is equal to one, then the
solution to Ax_i = y_i gives the i-th column of A^{-1}. Therefore, A^{-1}
is found by solving the equation n times using y_1, y_2, ..., and y_n.
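
The column-by-column procedure just described can be sketched in Python as follows. This is an illustrative sketch, not the algorithm of [7]: the function names are hypothetical, and the partial pivoting and the singularity tolerance of 1e-12 are added assumptions.

```python
def solve_gauss(a, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix [A | y]; copies are made so the inputs are not modified.
    aug = [[float(v) for v in a[i]] + [float(y[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - s) / aug[row][row]
    return x


def inverse_by_columns(a):
    """Build A^{-1} by solving A x_i = y_i for each unit vector y_i."""
    n = len(a)
    cols = []
    for i in range(n):
        unit = [1.0 if k == i else 0.0 for k in range(n)]
        cols.append(solve_gauss(a, unit))   # i-th column of the inverse
    return [[cols[j][i] for j in range(n)] for i in range(n)]
```

A production routine would eliminate once and reuse the factors for all n right-hand sides; the repeated elimination here simply mirrors the description in the text.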
Summary

This chapter has presented a method for the on-line
suboptimal adaptive trajectory control of a nonlinear system. By the
sequential identification of a linear constant-coefficient model, the
given system is forced to track a predetermined trajectory at certain
specified points. Therefore, an acceptable response can be obtained
even though the complete system dynamics are not known a priori.

The actual system response can be forced to fit the desired
trajectory more closely simply by specifying more points along
it. However, this results in longer computation
times and greater storage requirements.

A short  discussion  has  been  presented  on  the  calculation  of  the 
matrix  inverse  and  the  difficulties  encountered  in  its  calculation. 
CHAPTER V


APPLICATIONS 

The purpose of this chapter is to illustrate the effectiveness of
the methods of control presented in Chapters III and IV. The three
examples given are of practical importance and of current interest.
They include the startup of a nuclear reactor, the orbital transfer
of a vehicle using a low thrust ion engine, and the startup of a
nuclear powered rocket engine. Each of the problems is nonlinear, and
the different methods for choosing a linear model, as outlined in
Chapter II, are used. The sequential scheme used for parameter and
state estimation, which is also mentioned in Chapter II, proves to be
very effective for these applications.

In each case, the results are compared with predetermined "optimum"
or desired results. However, due to environmental changes or noise the
adaptive controllers may substantially reduce the cost over that which
results from a predetermined control. Although input and output noise
is injected in two of the examples, no environmental changes are
introduced, so that the effectiveness of the methods can be illustrated by
the comparisons with predetermined results.

All  calculations  were  made  on  an  IBM-709  digital  computer  at  the 
University  of  Florida  Computing  Center,  with  the  computer  programs 
being  written  in  Fortran  IV. 
Example 1: Nuclear Reactor Startup

The nuclear reactor kinetics equations [1] are given by

\dot{n} = \frac{\rho - \beta}{\Lambda} n + \lambda c    (5.1)

\dot{c} = \frac{\beta}{\Lambda} n - \lambda c    (5.2)

where

n = neutron flux density
c = precursor density
\rho = reactivity
\beta = 0.0064 = fraction of precursors formed
\Lambda = 0.001 seconds = neutron lifetime
\lambda = 0.1 seconds^{-1} = precursor decay constant.

It is desired to find the control which will drive the neutron flux
density to a desired value while minimizing the integral of control,
or reactivity, squared. That is, given (5.1) and (5.2) with

n(0) = 0.5 kW
c(0) = 32.0 kW    (5.3)

find the reactivity, \rho, such that

n(1) = 5.0    (5.4)

and the performance index

J = \frac{1}{2} \int_0^1 \rho^2 \, dt    (5.5)

is minimized.

The problem is further complicated by the presence of an input
noise, w, and an output noise, v. The input noise, w, is simulated by
a sawtooth waveform of zero mean, having a maximum magnitude of 0.0008
and a period of 0.02 seconds, and the output noise, v, is a similar
waveform with a maximum magnitude of 0.25. Due to this noise and a
reasonable doubt as to the validity of the assumptions made in the
derivation of (5.1) and (5.2), it is worthwhile to attempt an on-line
adaptive control of the reactor using the methods of Chapters III and IV.
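
A zero-mean sawtooth of the kind described can be generated as in the Python sketch below. The exact waveform is an assumption: a linear ramp from -magnitude to +magnitude that starts at the bottom of the ramp at t = 0 and repeats every period. The function name and argument names are hypothetical.

```python
import math


def sawtooth(t, magnitude, period):
    """Zero-mean sawtooth: ramps linearly from -magnitude to +magnitude each period."""
    phase = t / period - math.floor(t / period)   # fraction of the period elapsed, in [0, 1)
    return magnitude * (2.0 * phase - 1.0)


def input_noise(t):
    """Input noise w with the magnitude and period quoted in the text."""
    return sawtooth(t, 0.0008, 0.02)


def output_noise(t):
    """Output noise v; the same 0.02 second period is assumed for it."""
    return sawtooth(t, 0.25, 0.02)
```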

The overall system is shown in Figure 2, where the reactor
dynamics are given by

\dot{n} = -6.4 n + 0.1 c + 10^3 n(\rho + w)    (5.6)

\dot{c} = 6.4 n - 0.1 c    (5.7)

z = n + v.    (5.8)

Note the presence of the input noise, w, and the output measurement
noise, v. The model used for control at each t = t_i is given by

\dot{n} = -6.4 n + 10^3 \hat{n}(t_i) \rho + K(t_i)    (5.9)

where \hat{n}(t_i) is the estimate of n at t = t_i, and K(t_i) is
an unknown parameter to be identified. Note that the second-order
nonlinear system is reduced to a first-order linear system. It is,
of course, desired to find \rho(t_i).

Figure 2. Adaptive control of nuclear reactor (blocks: controller; identification and estimation; reactor dynamics with noise)
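
The reactor dynamics can be written as a short Python sketch, assuming the numerical coefficients of (5.6) and (5.7) are 6.4, 0.1, and 10^3 as they follow from the constants listed for this example. The function names and the simple Euler step are hypothetical and are not the integration method used in this work.

```python
def reactor_rhs(n, c, rho, w=0.0):
    """Right-hand sides of the assumed reactor dynamics (5.6) and (5.7)."""
    n_dot = -6.4 * n + 0.1 * c + 1.0e3 * n * (rho + w)
    c_dot = 6.4 * n - 0.1 * c
    return n_dot, c_dot


def euler_step(n, c, rho, dt, w=0.0):
    """Advance the state by one explicit Euler step of length dt."""
    n_dot, c_dot = reactor_rhs(n, c, rho, w)
    return n + dt * n_dot, c + dt * c_dot
```

With zero reactivity and no noise, the point n = 0.5, c = 32.0 is an equilibrium of these equations, so any startup must be driven by a positive reactivity.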

The model used for the estimation of \hat{n}(t_i) and K(t_i), used
in (5.9), is given by

\dot{n} = -6.4 n + 10^3 n(\rho + w) + K    (5.10)

\dot{K} = 0    (5.11)

z = n + v    (5.12)

where it is necessary to estimate n and identify K, and \rho is known
for all past time, t < t_i. Using equations (2.6), (2.8), and (2.9),
the following sequential estimation equations are derived.

\dot{\hat{n}} = [-6.4 + 10^3 \rho] \hat{n} + \hat{K} + 2 W_1 (z - \hat{n}) P_{11}

\dot{\hat{K}} = 2 W_1 (z - \hat{n}) P_{21}

\dot{P}_{11} = 2 [-6.4 + 10^3 \rho] P_{11} + P_{12} + P_{21} - 2 W_1 P_{11}^2 + \frac{1}{2 W_2}    (5.13)

\dot{P}_{12} = [-6.4 + 10^3 \rho] P_{12} + P_{22} - 2 W_1 P_{11} P_{12}

\dot{P}_{21} = [-6.4 + 10^3 \rho] P_{21} + P_{22} - 2 W_1 P_{11} P_{21}

\dot{P}_{22} = -2 W_1 P_{12} P_{21}

Note that P_{12} = P_{21}, and because the estimation problem is linear,
neither \hat{n} nor \hat{K} appears in the P equations.
Regulator Control

For the regulator type of control presented in Chapter III,
a cost function of the form

J = \frac{1}{2} R [n(1) - 5.0]^2 + \frac{1}{2} \int_{t_i}^{1.0} \rho^2 e^{\omega t} \, dt    (5.14)

is used. By using equation (5.9), the Hamiltonian is written as

H = \frac{1}{2} e^{\omega t} \rho^2 + \lambda_1 [-6.4 n + 10^3 \hat{n}(t_i) \rho + K(t_i)].    (5.15)

The canonic equations, (3.13) and (3.14), along with (3.18)
through (3.25), yield

\dot{n} = -6.4 n + 10^3 \hat{n}(t_i) \rho + K(t_i)

\dot{K} = 0

\dot{\rho} = -(\omega - 6.4) \rho    (5.16)

\dot{\sigma} = -\frac{\rho}{10^3 \hat{n}(t_i)} - \omega \sigma

with the transversality condition

\rho(1) = -10^3 \hat{n}(t_i) R e^{-\omega (1.0 - t_i)} [n(1) - 5.0],    (5.17)

and \sigma(1) = 0.0.

By the proper manipulation of (5.16), and the substitution of
(5.17), an analytical expression for \rho(t_i) is found.


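\rho(t_i) = - \frac{ \left[ \hat{n}(t_i) - \frac{K(t_i)}{6.4} \right] e^{-6.4(1.0 - t_i)} + \frac{K(t_i)}{6.4} - 5.0 }{ \frac{10^3 \hat{n}(t_i)}{\omega - 12.8} \left[ e^{-6.4(1.0 - t_i)} - e^{-(\omega - 6.4)(1.0 - t_i)} \right] + \frac{1.0}{R \, 10^3 \hat{n}(t_i)} e^{+6.4(1.0 - t_i)} }    (5.18)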


Thus, due to the low order of the system model, the calculation of
the matrix exponential can be waived, and the control can be found by
substituting t_i, \hat{n}(t_i), and K(t_i) into (5.18).
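
The closed-form control can be evaluated with a few lines of Python. The sketch below is written under an explicit assumption about the form of (5.18): it solves \dot{\rho} = -(\omega - 6.4)\rho and the model (5.9) over the time to go and imposes the terminal condition \rho(1) = -10^3 \hat{n}(t_i) R e^{-\omega(1.0 - t_i)} [n(1) - 5.0]. The function and argument names are hypothetical, and \omega must differ from 12.8.

```python
import math


def regulator_reactivity(n_hat, k_hat, t_i, r_weight, omega,
                         n_target=5.0, t_final=1.0):
    """Reactivity rho(t_i) from the assumed closed form of (5.18)."""
    tau = t_final - t_i                      # time to go
    gain = 1.0e3 * n_hat                     # control coefficient of the model (5.9)
    # Terminal error that would result with zero control.
    numerator = ((n_hat - k_hat / 6.4) * math.exp(-6.4 * tau)
                 + k_hat / 6.4 - n_target)
    denominator = (gain / (omega - 12.8)
                   * (math.exp(-6.4 * tau) - math.exp(-(omega - 6.4) * tau))
                   + math.exp(6.4 * tau) / (r_weight * gain))
    return -numerator / denominator
```

With the values quoted for this example (R = 0.001, \omega = 3.0, an initial flux estimate of 0.5, and an initial parameter estimate of 3.5), the function returns a small positive reactivity at t_i = 0, as expected for a startup.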

The results of this approach are given in Figures 3, 4, and 5,
with R = 0.001, \omega = 3.0, and a subinterval size, (t_{i+1} - t_i), of 0.05.
The initial condition matrix for P is given by

P(0) = \begin{bmatrix} 20 & 5 \\ 5 & 20 \end{bmatrix}

with W_1 = 10, W_2 = 10, \hat{n}(0) = 0.5, and \hat{K}(0) = 3.5.


Trajectory Control

For the trajectory type of adaptive control, a set of points
is taken from the predetermined optimum trajectory for n, the neutron
flux density. These values, n_d(t_i), are equally spaced in time so
that \Delta t = t_{i+1} - t_i = 0.05 seconds. The cost function used is

V_i = \frac{1}{2} R [n(t_{i+1}) - n_d(t_{i+1})]^2 + \frac{1}{2} \int_{t_i}^{t_{i+1}} \rho^2 \, dt.    (5.19)

Figure 3. Nuclear reactor startup using regulator control

Figure 5. Reactivity showing effect of noise

Due to the similarity between (5.19) and (5.14), and since the system
model is the same as for the regulator control, the result for \rho(t_i)
is similar to (5.18) with n_d(1) = 5.0 being replaced by n_d(t_{i+1}),
(1.0 - t_i) being replaced by \Delta t = t_{i+1} - t_i, and \omega being set equal
to zero. Therefore,


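\rho(t_i) = - \frac{ \left[ \hat{n}(t_i) - \frac{K(t_i)}{6.4} \right] e^{-6.4 \Delta t} + \frac{K(t_i)}{6.4} - n_d(t_{i+1}) }{ -\frac{10^3 \hat{n}(t_i)}{12.8} \left[ e^{-6.4 \Delta t} - e^{6.4 \Delta t} \right] + \frac{1.0}{R \, 10^3 \hat{n}(t_i)} e^{6.4 \Delta t} }    (5.20)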


The results are given in Figure 6 with R = 0.1 and with the
initial condition matrix for P being

P(0) = \begin{bmatrix} 20 & 5 \\ 5 & 20 \end{bmatrix}.

Also, W_1 = 10, W_2 = 10, \hat{n}(0) = 0.5, and \hat{K}(0) = 3.4.

Figure 6. Nuclear reactor startup using trajectory control
Example 2: Low Thrust Orbital Transfer

As an example of trajectory control, consider the problem of
minimizing the fuel consumption of a low thrust rocket which is to
transfer from the orbit of Earth to the orbit of Mars in fixed time.
The orbits of Mars and Earth are assumed to be circular and coplanar,
and the gravitational attractions of the two planets are neglected.
The problem has been previously formulated and solved for the open
loop control assuming an inequality constraint on propellant mass
flow, \beta, or thrust [2]. The normalized dynamics and boundary
conditions are given by


\dot{r} = w    (Radial velocity)

\dot{w} = \frac{v^2}{r} - \frac{K}{r^2} + \frac{C \beta}{m} \sin\theta    (Radial acceleration)    (5.21)

\dot{v} = -\frac{w v}{r} + \frac{C \beta}{m} \cos\theta    (Circumferential acceleration)

\dot{m} = -\beta    (Mass flow)

with

r(0) = 1.0        r(t_f) = 1.52
w(0) = 0.0        w(t_f) = 0.00
v(0) = 1.0        v(t_f) = 0.81
m(0) = 1.0        m(t_f) open
K = 1.00          C = 1.872
\beta_{max} = 0.075   \beta_{min} = 0.0


where the final time, t_f, is 3.816 units, which corresponds to 222.0
days, and \theta is the thrust angle measured from the local horizontal.
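
The normalized dynamics can be checked numerically with the Python sketch below. It assumes the standard polar form \dot{r} = w, \dot{w} = v^2/r - K/r^2 + (C\beta/m) sin \theta, \dot{v} = -wv/r + (C\beta/m) cos \theta, \dot{m} = -\beta; the function name and argument order are hypothetical.

```python
import math


def orbit_rhs(r, w, v, m, beta, theta, K=1.0, C=1.872):
    """Right-hand sides of the assumed normalized transfer dynamics (5.21)."""
    r_dot = w                                                        # radial velocity
    w_dot = v * v / r - K / (r * r) + C * beta * math.sin(theta) / m  # radial acceleration
    v_dot = -w * v / r + C * beta * math.cos(theta) / m               # circumferential acceleration
    m_dot = -beta                                                    # mass flow
    return r_dot, w_dot, v_dot, m_dot
```

With the engine off, both the initial state (r = 1.0, w = 0, v = 1.0) and a circular orbit at r = 1.52 with v = 1/sqrt(1.52), approximately 0.81, give zero accelerations, which agrees with the boundary conditions listed above.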

It is desired to minimize the fuel consumed or, equivalently, the cost
function

\phi = -m(t_f).

Computational results have been obtained using the iterative
method of quasilinearization, which show that the open-loop control is
of the bang-off-bang type. However, in actual practice, due to measurement
errors, noise, etc., it may not be desirable to apply this as a
precalculated open-loop control, especially since near impact the system
may require more than the open-loop \beta_{max} to match the critical
endpoints. For this reason, trajectory control seems feasible since it
monitors the system and tends to keep it tracking the precalculated
trajectory, although possibly at more cost.

Since \dot{r} and \dot{m} are relatively small, it seems feasible to model
the system at each t = t_i with a second order system of the form

\dot{w} = \frac{v(t_i)}{r(t_i)} v - \frac{K}{r^2(t_i)} + \frac{C}{m(t_i)} u_1

\dot{v} = -\frac{v(t_i)}{r(t_i)} w + \frac{C}{m(t_i)} u_2    (5.22)

with

u_1 = \beta \sin\theta

u_2 = \beta \cos\theta.    (5.23)
The total time for the flight (222 days) is divided into 37
subintervals t \in (t_i, t_{i+1}) where i = 0, 1, ..., 36. Thus each
subinterval corresponds to six days. It is desired to minimize the
performance index, given by

V_i = \frac{1}{2} R_{11} [w(t_{i+1}) - w_d(t_{i+1})]^2 + \frac{1}{2} R_{22} [v(t_{i+1}) - v_d(t_{i+1})]^2 + \frac{1}{2} \int_{t_i}^{t_{i+1}} \alpha (u_1^2 + u_2^2) \, dt    (5.24)

with R_{11} = 1000, R_{22} = 1000, and \alpha = 1. The values for w_d and v_d
are taken from the optimal open-loop trajectory.

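Using the maximum principle, the resulting canonic equations
are

\dot{w} = \frac{v(t_i)}{r(t_i)} v - \frac{K}{r^2(t_i)} + \frac{C}{m(t_i)} u_1

\dot{v} = -\frac{v(t_i)}{r(t_i)} w + \frac{C}{m(t_i)} u_2

\dot{u}_1 = \frac{v(t_i)}{r(t_i)} u_2    (5.25)

\dot{u}_2 = -\frac{v(t_i)}{r(t_i)} u_1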


Note that (5.25) can be put in the form of (4.16) and (4.17)
by adjoining to it the equation

\dot{s} = 0    (5.26)

where s = -\frac{K}{r^2(t_i)}.


The transversality condition, (4.20), yields

(5.27)

Using the matrix exponential form of solution, given by (4.21), (4.22),
and (4.23), u_1(t_i) and u_2(t_i) can be found.

The presence of an input noise, I, and two components of
output measurement noise, N1 and N2, which add to w and v, respectively,
requires that estimates of r, w, v, and m be obtained for use in (5.25)
and (5.27). I, N1, and N2 are all chosen to be sawtooth waveforms of
zero mean, with maximum magnitudes of 0.005, 0.005, and 0.05,
respectively, and periods of 5.5 days, 4.5 days, and 3.5 days, respectively.
Since the system is a fourth-order system, the P matrix of the estimator
is a 4 x 4 matrix. Thus, although the P matrix is symmetrical, the
estimation scheme requires the simultaneous solution of fourteen
differential equations. For this reason it seems feasible to use an
estimation model of the form

\dot{w} = \frac{v^2}{r(t-\delta t)} - \frac{K}{r^2(t-\delta t)} + \frac{C (\beta + I) \sin\theta}{m(t-\delta t)}

\dot{v} = -\frac{w v}{r(t-\delta t)} + \frac{C (\beta + I) \cos\theta}{m(t-\delta t)}    (5.28)

z_1 = w + N1

z_2 = v + N2.
Note that since r and m are not estimated directly, they are estimated
by integrating \hat{w} and -\beta, respectively, and then used in (5.28). This
introduces a delay of \delta t seconds for these terms in (5.28);
however, this has little effect on the results.

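Using the estimation procedure of Chapter II, (2.9) gives

\dot{\hat{w}} = \frac{\hat{v}^2}{r(t-\delta t)} - \frac{K}{r^2(t-\delta t)} + \frac{C \beta \sin\theta}{m(t-\delta t)} + 2 P_{11} W_1 (z_1 - \hat{w}) + 2 P_{12} W_1 (z_2 - \hat{v})

\dot{\hat{v}} = -\frac{\hat{w} \hat{v}}{r(t-\delta t)} + \frac{C \beta \cos\theta}{m(t-\delta t)} + 2 P_{21} W_1 (z_1 - \hat{w}) + 2 P_{22} W_1 (z_2 - \hat{v})

\dot{P}_{11} = \frac{2 \hat{v}}{r(t-\delta t)} (P_{21} + P_{12}) - 2 W_1 (P_{11}^2 + P_{12} P_{21}) + \frac{\sin^2\theta}{2 W_2}    (5.29)

\dot{P}_{12} = \frac{2 \hat{v}}{r(t-\delta t)} P_{22} - \frac{\hat{v}}{r(t-\delta t)} P_{11} - \frac{\hat{w}}{r(t-\delta t)} P_{12} - 2 W_1 (P_{11} P_{12} + P_{12} P_{22}) + \frac{\sin\theta \cos\theta}{2 W_2}

\dot{P}_{21} = \frac{2 \hat{v}}{r(t-\delta t)} P_{22} - \frac{\hat{v}}{r(t-\delta t)} P_{11} - \frac{\hat{w}}{r(t-\delta t)} P_{21} - 2 W_1 (P_{11} P_{21} + P_{21} P_{22}) + \frac{\sin\theta \cos\theta}{2 W_2}

\dot{P}_{22} = -\frac{\hat{v}}{r(t-\delta t)} (P_{12} + P_{21}) - \frac{2 \hat{w}}{r(t-\delta t)} P_{22} - 2 W_1 (P_{12} P_{21} + P_{22}^2) + \frac{\cos^2\theta}{2 W_2}.

Since P_{12} = P_{21}, only five equations must be solved. For this
problem W_1 = 1.0, W_2 = 1.0, \hat{w}(0) = w(0) = 0.0, \hat{v}(0) = v(0) = 1.0, and
P(0) is chosen to be the null matrix.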


The results for this scheme are compared with the optimal
results in Figures 7, 8, 9, and 10. Although the thrust angle, \theta,
shown in Figure 10 appears to be considerably in error during the
middle portion of the flight, this is not significant since the thrust itself
is very small over this range, and the system is only attempting to
remain on the desired trajectory. The suboptimal closed-loop system
reaches the desired value at the final time, but at a final mass of
0.8536 rather than the open-loop value of 0.8595. In the suboptimal
solution, no inequality thrust magnitude constraints are present, and
the maximum thrust used is 0.07786, which compares very favorably with
the value of 0.075 specified as an inequality constraint for the
open-loop solution.

Figure 8. Radial velocity

Figure 10. Thrust angle, \theta (optimum and suboptimum curves; note: \beta \approx 0 in the middle region)


Example  3:  Nuclear  Rocket  Engine  Startup 


Rapid startup of a nuclear rocket engine is necessary to
conserve propellant, and to reduce the complexity of attitude control
problems [3]. For a rapid startup, however, a nuclear rocket engine
is difficult to control, particularly since there is disagreement as
to the system dynamics, and also the effect of gravity upon flow and
heat transfer is not completely determined. For these reasons, it
appears that perhaps an on-line method of control is necessary.

Consider the normalized system dynamics as given by Smith and
Stenning [4],


\dot{n} = 50 (\rho n - n + c)

\dot{c} = 0.1 (n - c)

\dot{T} = 1.471 (n - P T^{1/2})

(5.30)


2 


T 


- T + 18  | 


(5.31) 


where

n = neutron flux density
c = precursor density
T = maximum core surface temperature
P = core inlet stagnation pressure
\rho = reactivity
\rho_c = control poison reactivity.

It is possible to model (5.30) at each t = t_i with a linear
model of the form

\dot{n} = -50 n + 50 c + 50 n(t_i) \rho

\dot{c} = 0.1 n - 0.1 c    (5.32)

\dot{T} = 1.471 n - 1.471 T^{1/2}(t_i) P

\dot{P} = 0.4 \left[ 0.915 T^{1/2}(t_i) - \frac{P(t_i)}{T^{1/2}(t_i)} \right] P.

It is desired to find a value of reactivity, \rho, at t = t_i
which will minimize the cost function

V_i = \frac{1}{2} \alpha [T(t_{i+1}) - T_d(t_{i+1})]^2 + \frac{1}{2} \int_{t_i}^{t_{i+1}} \rho^2 \, dt    (5.33)

where the values for T_d(t_{i+1}) are taken from a desired trajectory.

Following the formulation given in Chapter IV, the canonic
equations are obtained from the maximum principle, yielding (5.32)
along with
\dot{\rho} = -5 n(t_i) \sigma_1 - 73.55 n(t_i) \sigma_2 + 50 \rho

\dot{\sigma}_1 = -\frac{\rho}{n(t_i)} + 0.1 \sigma_1

\dot{\sigma}_2 = 0    (5.34)

\dot{\sigma}_3 = 1.471 T^{1/2}(t_i) \sigma_2 - 0.4 \left[ 0.915 T^{1/2}(t_i) - \frac{P(t_i)}{T^{1/2}(t_i)} \right] \sigma_3.
At t = t_i, n, c, T, and P are known to be

n(t_i)
c(t_i)
T(t_i)    (5.35)
P(t_i)

and at t = t_{i+1},

\rho(t_{i+1}) = 0
\sigma_1(t_{i+1}) = 0    (5.36)
\sigma_3(t_{i+1}) = 0.

The transversality condition (4.20) yields

\sigma_2(t_{i+1}) = -\alpha [T(t_{i+1}) - T_d(t_{i+1})].    (5.37)
Using the matrix exponential form of solution, given by (4.21), (4.22),
and (4.23), \rho(t_i) can be found. Then, \rho_c(t_i), which corresponds to
control rod movement, can be found using (5.31).

This procedure is utilized in tracking several desired
temperature trajectories given in Figure 11. The corresponding curves for the
control poison reactivity are given in Figure 12. The sampling
interval, \Delta t = t_{i+1} - t_i, varies from 0.01 seconds to 0.05 seconds depending
upon the accuracy desired and the particular case. Note that the use
of this method allows the generation of the control necessary to
maintain a steady-state condition. It can be shown that eventually the
system will reach steady-state values of


c 


n 


ss 


ss 


(5.38) 


P 


0 


ss 


and 


where the subscript ss denotes steady-state and T_{dss} denotes the
desired steady-state value for the temperature.
In Figure 13, an optimum temperature trajectory is tracked
using a sampling interval of 0.025 seconds. The optimum trajectory
results from the control of the system (5.30) over the time interval
t \in (0, 10) while specifying T(10) = 0.4 and minimizing the cost function

J = \frac{1}{2} \int_0^{10} \rho^2 \, dt.    (5.39)

The other state variables, n, c, and P, are given in Figure 14, with
the total reactivity, \rho, and the control poison reactivity, \rho_c, being
given in Figures 15 and 16, respectively.
Figure 11. Desired temperature trajectories

Figure 12. Control poison reactivity

Figure 15. Total reactivity for optimum temperature trajectory

Figure 16. Control poison reactivity for optimum temperature
trajectory
CHAPTER VI


CONCLUSIONS  AND  RECOMMENDATIONS 

The results presented in this dissertation constitute the
development of two methods of control for continuous nonlinear
systems which appear feasible for use in an on-line fashion. These
methods of control are combined with a method of state and parameter
estimation to yield an adaptive system which attempts to operate in
some optimal fashion or to track an acceptable trajectory. The
emphasis in the derivation of these control laws is placed on the
simplicity of the formulation and the speed and ease with which the control can
be calculated and applied.

The  development  of  the  suboptimal  adaptive  regulator  scheme 
presented  in  Chapter  III  utilizes  the  identification  of  a linear 
constant-coefficient  system  at  discrete  instants  of  time  and  the 
calculation of a control which minimizes a time-weighted quadratic
performance  index  over  the  remaining  time  to  go.  Since  this  method 
allows  the  system  to  adapt  to  new  trajectories,  it  is  possible  to 
reduce  the  cost  over  that  which  would  result  from  the  application  of 
an  open-loop  control,  based  on  dynamics  that  are  not  completely  known 
a priori. 

The  suboptimal  adaptive  trajectory  control  scheme  presented 
in  Chapter  IV  is  also  based  on  the  identification  of  a linear 
constant-coefficient model at discrete instants of time. The control
is then calculated in such a way as to force the system to track some
predetermined desired trajectory. This usually ensures an acceptable
response even though it may not be optimal.
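
The trajectory idea summarized above can be illustrated with a minimal sketch. It is not the control law derived in Chapter IV: the scalar plant x' = -x**3 + u, the desired trajectory 1 - exp(-t), and all function names below are hypothetical choices made only for illustration. At each sampling instant a constant-coefficient model x' = a*x + u is frozen at the current state, and a constant control is chosen so that the model reaches the desired trajectory at the next sampling instant.

```python
import math

def plant_rate(x, u):
    """Hypothetical nonlinear plant x' = -x**3 + u (an assumption, not from the dissertation)."""
    return -x**3 + u

def desired(t):
    """Hypothetical desired trajectory."""
    return 1.0 - math.exp(-t)

def tracking_control(x_i, target, dt):
    """Constant control that moves the frozen model x' = a*x + u from x_i to target in dt."""
    a = -x_i**2                       # frozen coefficient, so that a*x_i equals -x_i**3
    phi = math.exp(a * dt)            # model transition factor over one interval
    # Integral of exp(a*s) over [0, dt]; the limit as a -> 0 is dt.
    gain = dt if abs(a) < 1e-12 else math.expm1(a * dt) / a
    return (target - phi * x_i) / gain

def simulate(t_final=5.0, dt=0.025, substeps=50):
    """Re-identify the model and re-solve for the control at every sampling instant."""
    x, worst = 0.0, 0.0
    n = round(t_final / dt)
    for i in range(n):
        t = i * dt
        u = tracking_control(x, desired(t + dt), dt)
        h = dt / substeps
        for _ in range(substeps):     # fine Euler steps stand in for the continuous plant
            x += h * plant_rate(x, u)
        worst = max(worst, abs(x - desired(t + dt)))
    return x, worst

final_x, worst_error = simulate()
print(f"final state {final_x:.4f}, worst sampled tracking error {worst_error:.2e}")
```

Because the model is rebuilt from the measured state at every sampling instant, the mismatch between the frozen model and the plant is corrected one interval later rather than accumulating over the run.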

Applications  of  these  two  methods  of  control  are  given  in 
Chapter V. The problems considered are nuclear reactor startup, low-thrust
orbital transfer, and nuclear rocket engine startup. It is
difficult to present a general approach which is applicable to all
systems. In each instance, there must be some previous knowledge of
the system, and preliminary decisions must be made concerning the
type of model to be used, sampling interval size, weighting factors,
etc. These decisions must be based on trial-and-error procedures along
with a good understanding of the results to be obtained.

During the development of the material presented in this dissertation,
many problems were encountered which deserve further
investigation. Among these were:

1. Stability problem. The utilization of these methods of
control can present certain stability problems which cannot be readily
analyzed. These problems are usually the result of either inaccurate
system models, large sampling intervals, or both. For example, when
using the method of trajectory control, the system response in certain
cases can be made to oscillate and eventually go unstable if the
sampling interval is too large. This, of course, results from the
system continuously overcompensating. Although this problem was evident
in the examples used in Chapter V, it was successfully eliminated
by choosing smaller sampling intervals. More investigation is necessary
to determine the relationship between instability, sampling interval
size, and model errors.

2. Prediction of system dynamics. In each of the methods of
control presented, a model was chosen at $t = t_i$ based on information
at $t = t_i$. However, no use was made of the past history of the system
to possibly obtain a better estimate of the system dynamics over the
next period of interest. To be more specific, perhaps a method of
learning, averaging, or extrapolation could be used. This would be
particularly  advantageous  for  the  regulator  method  of  control  where 
the  use  of  a linear  constant-coefficient  system  is  generally  less 
accurate  than  for  the  trajectory  method  of  control. 

3. Minimization of storage space for trajectory method. Since
storage space is often excessive for trajectory methods, more investigation
should be conducted to aid in the reduction of storage requirements.
One possibility would be the use of small sampling intervals
when the system is responding quickly and large sampling intervals
otherwise [1].

4. Search method for finding optimal controls. It is, in
general, very difficult to find optimal controls for nonlinear systems.
However, in some instances, an approximation of the optimal control can
be found off-line by using either of the methods of control presented
in this dissertation. This involves the use of different weighting
factors in the regulator method, and different desired trajectories in
the trajectory method. Although this is trial and error, it may be
possible to combine these methods with existing methods, such as gradient
techniques, in order to attain a more sophisticated search method.
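
The overcompensation described in the stability item of the list above can be illustrated with a minimal sketch. The plant here is a hypothetical pure integrator x' = u with a proportional tracking control held constant over each sampling interval; neither the plant nor the gain is taken from the Chapter V examples. Under these assumptions the tracking error at the sampling instants obeys e_{k+1} = (1 - gain*dt) e_k, so it decays monotonically for gain*dt < 1, oscillates while decaying for 1 < gain*dt < 2, and oscillates with growing amplitude for gain*dt > 2.

```python
def sampled_error_history(gain, dt, steps, e0=1.0):
    """Tracking error at the sampling instants for x' = u with u = -gain*e held over each interval."""
    errors = [e0]
    for _ in range(steps):
        # The control is frozen for dt, so the error changes by exactly -gain*dt*e.
        errors.append((1.0 - gain * dt) * errors[-1])
    return errors

gain = 10.0  # hypothetical feedback gain
for dt in (0.05, 0.15, 0.25):
    history = sampled_error_history(gain, dt, steps=20)
    sign_changes = sum(1 for a, b in zip(history, history[1:]) if a * b < 0)
    print(f"dt={dt:.2f}  |e_20|={abs(history[-1]):.3e}  sign changes={sign_changes}")
```

Even in this simplest case the admissible sampling interval depends on the effective loop gain, which is consistent with the observation that smaller intervals removed the instability.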

APPENDIX A

DERIVATION OF THE INVARIANT IMBEDDING EQUATION


For the derivation of the invariant imbedding equation consider
the two-point boundary-value problem described by the vector differential
equations


$$\begin{aligned} \dot{x} &= f(x, y, t) \\ \dot{y} &= g(x, y, t) \end{aligned} \tag{A.1}$$


where  x and  y are  n-dimensional  vectors.  The  boundary  conditions 
are  given  by 


$$y(0) = a, \qquad y(T) = b \tag{A.2}$$


with the process starting at t = 0 and ending at t = T. Let

$$x(T) = r(C, T) \tag{A.3}$$

where

$$y(T) = C. \tag{A.4}$$

With C and T regarded as independent variables, write

$$r(C + \Delta C, T + \Delta T) = r(C, T) + f(r, C, T)\, \Delta T + O(\Delta^{2}) \tag{A.5}$$
where

$$\lim_{\Delta \to 0} \frac{O(\Delta^{2})}{\Delta} = 0. \tag{A.6}$$


The left side of (A.5) can be expanded in a Taylor series to yield

$$r(C + \Delta C, T + \Delta T) = r(C, T) + \frac{\partial r}{\partial C} \Delta C + \frac{\partial r}{\partial T} \Delta T + O(\Delta^{2}). \tag{A.7}$$


From (A.1) and (A.4) write

$$\Delta C = g(r, C, T)\, \Delta T + O(\Delta^{2}). \tag{A.8}$$

Then, by equating the right-hand sides of equations (A.5) and
(A.7), and substituting (A.8) for $\Delta C$, the equation

$$\frac{\partial r}{\partial T} + \frac{\partial r}{\partial C}\, g(r, C, T) = f(r, C, T) \tag{A.9}$$

results.  This  is  a partial  differential  equation  which,  with  the 
proper  conditions  on  r,  governs  the  dependence  of  the  missing  terminal 
conditions  on  x as  a function  of  the  duration  of  the  process  and  the 
terminal  conditions  on  y. 
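
As a check on the form of the invariant imbedding equation, consider a scalar example chosen here purely for illustration (it does not appear in the original derivation): f = y and g = x, with y(0) = 0 and y(T) = C. The solutions satisfying y(0) = 0 are x(t) = x_0 cosh t and y(t) = x_0 sinh t, so the missing terminal condition can be written in closed form for T > 0 and substituted directly.

```latex
\begin{align*}
\dot{x} &= y, \qquad \dot{y} = x, \qquad y(0) = 0, \qquad y(T) = C, \\
r(C, T) &= x(T) = C \coth T, \\
\frac{\partial r}{\partial T} &= -C \operatorname{csch}^{2} T, \qquad
\frac{\partial r}{\partial C} = \coth T, \\
\frac{\partial r}{\partial T} + \frac{\partial r}{\partial C}\, g(r, C, T)
  &= -C \operatorname{csch}^{2} T + (\coth T)(C \coth T)
   = C \left( \coth^{2} T - \operatorname{csch}^{2} T \right) = C = f(r, C, T).
\end{align*}
```

The closed form is singular as T approaches zero, which reflects the fact that with both boundary conditions placed on y nothing is known about x at the start of the process.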


APPENDIX  B 


DERIVATION  OF  SEQUENTIAL  ESTIMATOR  EQUATIONS 


Consider the class of systems defined by

$$\begin{aligned} \dot{x} &= g(t, x) + k(t, x)\, w \\ z &= h(t, x) + v \end{aligned} \tag{B.1}$$

where

    g(t, x) = n vector function
    k(t, x) = n x p matrix function
    w       = p vector unknown input
    h(t, x) = m vector function
    z       = m vector output
    v       = m vector measurement error.

It is desired to find the least square estimate of x, designated $\hat{x}$,
which minimizes the cost function

$$J = \int_{0}^{T} \left[ \| z - h(t, x) \|^{2}_{W_1} + \| \dot{x} - g(t, x) \|^{2}_{W_2} \right] dt \tag{B.2}$$

or, alternately written as

$$J = \int_{0}^{T} \left[ \| z - h(t, x) \|^{2}_{W_1} + \| w \|^{2}_{k' W_2 k} \right] dt \tag{B.3}$$


where $W_1$ and $W_2$ are weighting matrices which determine the relative
weighting to be placed on the individual terms of the cost function.

Utilizing the maximum principle, the Hamiltonian is written as

$$H = \| z - h(t, x) \|^{2}_{W_1} + \| w \|^{2}_{V} + \lambda' \left[ g(t, x) + k(t, x)\, w \right] \tag{B.4}$$

where $V = k' W_2 k$. The canonic equations are given by


$$\dot{x} = \frac{\partial H}{\partial \lambda}, \qquad \dot{\lambda} = -\frac{\partial H}{\partial x} \tag{B.5}$$


with

[equation (B.6) is missing from the source text]

Since T is fixed and x(0) and x(T) are free,

$$\lambda(0) = 0, \qquad \lambda(T) = 0. \tag{B.7}$$


It is necessary to solve the two-point boundary-value problem given by
(B.5) and (B.7). Consider the more general problem of letting

$$\lambda(0) = 0, \qquad \lambda(T) = C \tag{B.8}$$


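and $x(T) = r(C, T)$. Then $r(C, T)$ satisfies the invariant imbedding
equation

$$\frac{\partial r}{\partial T} - \frac{\partial r}{\partial C} \frac{\partial H}{\partial r}(r, C, T) = \frac{\partial H}{\partial C}(r, C, T). \tag{B.9}$$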

Assume a solution of the form

$$r(C, T) = \hat{x}(T) + P(T)\, C \tag{B.10}$$

where P(T) is an n x n matrix. Substitute (B.10) into (B.9) and
expand the result about r(0, T). After collecting terms of like order
in C, equations for $\hat{x}$ and P can be found. With the
P equation divided by C and with P replaced by -P, it is noted that
only those solutions for which C = 0 are of interest. Thus the
sequential estimator equations become


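$$\begin{aligned} \dot{\hat{x}} &= g(T, \hat{x}) + 2 P(T) H(T, \hat{x}) W_1 \left[ z(T) - h(T, \hat{x}) \right] \\ \dot{P} &= g_x(T, \hat{x}) P + P g_x'(T, \hat{x}) + 2 P \left[ H W_1 \{ z(T) - h(T, \hat{x}) \} \right]_x P + \tfrac{1}{2} k(T, \hat{x}) V^{-1}(T, \hat{x}) k'(T, \hat{x}) \end{aligned} \tag{B.11}$$

where

$$g_x = \frac{\partial g}{\partial x}$$

and $\left[ H W_1 \{ z(T) - h(T, \hat{x}) \} \right]_x$ is an n x n matrix.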
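
The structure of the sequential estimator equations can be seen in a minimal scalar sketch. Everything specific below is an assumption made for illustration: the system x' = -x + w (so g = -x and k = 1), the measurement z = x + v (so h = x), noise-free data generated from x(t) = exp(-t), unit weights W1 = W2 = 1, forward-Euler integration, and the function name `run_estimator`. The matrix H is taken here to be the derivative of h with respect to x, since its definition does not survive in the text above.

```python
import math

def run_estimator(t_final=8.0, dt=0.001, w1=1.0, w2=1.0, xhat0=0.0, p0=1.0):
    """Euler integration of the scalar form of (B.11) for x' = -x + w, z = x + v."""
    steps = round(t_final / dt)
    xhat, p = xhat0, p0
    v_inv = 1.0 / w2               # V = k' W2 k with k = 1
    for i in range(steps):
        t = i * dt
        z = math.exp(-t)           # noise-free measurement of the true state x(t) = exp(-t)
        residual = z - xhat        # z - h(t, xhat) with h = x
        # g = -x gives g_x = -1; H = dh/dx = 1; d/dx [H W1 (z - h)] = -W1
        xhat_rate = -xhat + 2.0 * p * w1 * residual
        p_rate = -2.0 * p - 2.0 * w1 * p * p + 0.5 * v_inv
        xhat += dt * xhat_rate
        p += dt * p_rate
    return xhat, p, math.exp(-t_final)

xhat, p, x_true = run_estimator()
print(f"estimate {xhat:.3e}, true state {x_true:.3e}, final P {p:.4f}")
```

With these choices the P equation reduces to a scalar Riccati equation whose positive equilibrium is (sqrt(2) - 1)/2, and the estimate converges to the true state from a deliberately wrong initial guess.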